THE VALUE TO ECONOMICS OF FORMAL
STATISTICAL METHODS.

By Carl J. West, Ph.D., Ohio State University. 



To afford an accurate form of summary statement of eco- 
nomic facts and changes, statistics must present the facts in 
such a way as to enable the mind to grasp them as a whole 
more readily and clearly. From this point of view the chief 
care of the statistician is to secure accurate and comprehensive 
field-work or counting. As a recorder and tabulator of eco- 
nomic data he can consider his work done either when each 
individual instance has been enumerated or when a definite 
estimate can be made of the per cent. of accuracy.

But economics demands that statistics do more than serve 
as a sort of bookkeeper. It is only by a study of the statistics 
that causes and relations can be suggested and the basis laid 
for empirical laws. The complexity of our economic relations 
requires the economist to keep in constant and close touch 
with concrete facts. 

To what extent the prevalence of a certain disease depends 
on the climate or the season and to what extent on the state 
of sanitation can in general be determined only from an ex- 
tensive statistical investigation. The intricate questions con- 
cerning the rise and fall of the interest rate are largely matters 
of dependence among different series of statistical facts. A 
general theory of prices and the gold supply needs empirical 
verification at every point. The fluctuations of wages and the 
movement of retail and wholesale prices can not be adequately 
understood until better and more accurate data can be ob- 
tained. Immigration and business prosperity and depression, 
the consumption of alcohol and the presence of poverty are 
essentially questions of the effect which variations in one 
condition or characteristic produce in related attributes or 
conditions. 

These illustrations suggest the rather evident fact that the 
logic of most problems in economics is essentially the same. 
The ultimate aim is of course to detect and demonstrate causal 
relations. The limitations or requirements of the problem may
render it undesirable to attempt a more detailed formulation of 
the causal statement than of the type: If cause or event A is 
present then effect or event B will follow; or negatively, since 
A is not present or does not vary when B varies in value or 
degree of intensity, A can not be the cause of B. In general, 
however, the description of the causal relation can not be con- 
sidered satisfactory until it is possible to state in detail just 
what change in the effect will follow from certain definite 
changes in the cause. Thus causal relations fall naturally into 
two broad divisions according as the characteristics or attri- 
butes are accurately determined and measured in detail, or are 
not measured further than to enumerate the cases in which 
each is present or absent. 

For the purposes of statistical economics, characteristics or 
occurrences may be said to be causally related when, other 
things being equal, the presence of a definite amount or degree 
of the one is always accompanied by a corresponding amount 
or degree of the other; so that, in general, if one changes the 
other changes and if one is present and acting, a corresponding 
effect is to be noted in the other. 

The physicist, the chemist or the engineer can make direct 
use of this definition since it is often possible, within working 
limits, to hold all other conditions constant while the condi- 
tions under consideration are varied. The engineer can so 
arrange his experiments that discordant and irrelevant ele- 
ments can be avoided, as, for instance, when the distinct 
strains that a steel beam undergoes are reproduced in the 
laboratory and the effects measured. 

But the material of the economist in degree, at least, is 
radically different from that of the student of the so-called 
exact sciences. The data of the former are always heterogeneous
and complex so that it is not possible to isolate the variations 
and observe their relations directly; neither can it be safely 
assumed that all other conditions are constant while the con- 
ditions studied vary or change. For these reasons the com- 
paratively simple and direct methods of those sciences will not 
apply to the solution of the problems of statistical economics. 
The social scientist requires methods for discovering and 
demonstrating the presence of definite and uniform tendencies
for variations in one condition or characteristic to depend on 
the changes in certain other conditions; that is, correlation 
methods which deal with measurements en masse rather than 
as individuals. 

Thus, to test the obvious fact that during the earlier years 
stature increases with age, the height of 1,000 individuals at 
ages ranging from 6 to 25 might be determined and the average 
height for each age computed. In the midst of the discon- 
certing variations due to lack of homogeneity in conditions of 
health, parentage, environment, posture, etc., the general ten-
dency for stature to increase with age would be apparent
in the data.

For the treatment of the essentially mass-aggregate or group 
problems of statistics which have to do with collective and not 
with individual measurements, a body of theory having the 
definite and systematic form of the other mathematical 
sciences has been developed. Owing to its having been first de- 
veloped for the problems of biology, however, there is need in 
some respects for adaptation to the requirements of the social 
sciences. 

Every economist who makes use of concrete statistical facts 
must form collective judgments, must rely largely upon cor- 
relations to point out causal relations regardless of whether he 
consciously and formally makes use of the terminology and 
methods. As an illustration of a simple type of question which 
can not be answered by the use of informal methods take the 
following data of the Sheffield smallpox outbreak of 1887-1888 
as given by Dr. Macdonell:*

Vaccination-Strength to Resist Smallpox when Incurred.

              Recoveries.   Deaths.   Total.
Present           3,951       200     4,151
Absent              278       274       552
Total             4,229       474     4,703

*Elderton, Frequency Curves and Correlation, p. 125.

This table shows clearly that in this instance vaccination
was highly effective in combating the disease. But suppose
the same statistical material were reclassified on the basis
of the presence or absence of a characteristic which we may
call "sanitary" and that the following distribution was ob-
tained:

"Sanitary"-Strength to Resist Smallpox when Incurred.

              Recoveries.   Deaths.   Total.
Present           3,850       195     4,045
Absent              379       279       658
Total             4,229       474     4,703

Apparently this measure or condition is about as effective 
as vaccination so that it is a matter of careful study to decide 
which has the higher efficiency; no casual method can be relied 
upon to yield a satisfactory answer. 
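
One formal way to settle such a comparison is to reduce each fourfold table to a single coefficient of association. The sketch below uses Yule's coefficient Q and the phi coefficient. The choice of these two measures is an assumption made here for illustration, since the text does not name a measure, and the function names are hypothetical. The "sanitary" figures are the supposed reclassification given above, not observed data.

```python
from math import sqrt

def yule_q(a, b, c, d):
    """Yule's Q for a fourfold table [[a, b], [c, d]]."""
    return (a * d - b * c) / (a * d + b * c)

def phi(a, b, c, d):
    """Phi coefficient (product-moment correlation of two yes/no attributes)."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

# Rows: attribute present / absent; columns: recoveries / deaths.
vaccination = (3951, 200, 278, 274)
sanitary = (3850, 195, 379, 279)   # the supposed reclassification

for name, table in (("vaccination", vaccination), ("sanitary", sanitary)):
    print(f"{name:12s} Q = {yule_q(*table):.3f}  phi = {phi(*table):.3f}")
```

On these figures both coefficients come out somewhat higher for vaccination (Q of roughly 0.90 against 0.87), a difference too small to judge by inspection of the tables. Whether the difference is more than accidental would still call for an estimate of sampling error.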

The inadequacy of informal methods may be further illus- 
trated by the difficulty of properly "smoothing" a series of 
measurements by generally loose methods. To know what vari- 
ations are accidental and what are significant requires first of all 
a thorough knowledge of the data, and to successfully eliminate 
the irrelevant or accidental elements without sacrificing the 
truly significant variations considerable skill in highly tech-
nical methods is necessary. The following table of the meas-
urements of stature of a class of students furnishes material
for an illustrative problem in "smoothing."

Stature (inches)   59  60  61  62  63  64  65  66   67   68   69  70  71  72  73  74
Frequency           1   2   2  11  11  48  45  97  100  126  103  97  45  46   4   1

In these measurements one would suspect for instance that 
the comparatively large numbers having stature of 72 and 70 
inches were not significant. The equal frequencies for 62 and 
63 inches are also to be noted. 
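
As a minimal sketch of what even the simplest formal smoothing does to this table, the block below applies a three-point moving average with weights 1, 2, 1. The method and the names `smooth_121`, `stature`, and `frequency` are assumptions chosen for illustration; the text does not prescribe a particular smoothing rule, and the end values are simply left unchanged.

```python
def smooth_121(values):
    """Three-point moving average with weights 1-2-1; endpoints kept as given."""
    smoothed = [float(v) for v in values]
    for i in range(1, len(values) - 1):
        smoothed[i] = (values[i - 1] + 2 * values[i] + values[i + 1]) / 4
    return smoothed

stature = list(range(59, 75))  # inches
frequency = [1, 2, 2, 11, 11, 48, 45, 97, 100, 126, 103, 97, 45, 46, 4, 1]

smoothed = smooth_121(frequency)
for inches, raw, fitted in zip(stature, frequency, smoothed):
    print(f"{inches:3d} {raw:4d} {fitted:7.2f}")
```

The smoothed column removes the small reversals at 64 to 65 and at 71 to 72 inches, but it also lowers the peak at 68 inches from 126 to 113.75. A crude rule of this kind cannot tell an accidental irregularity from a significant one, which is the difficulty the text describes.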

Economists have done little "smoothing" or other refining 
of their data for the reason that they have not attempted to 
utilize more than a very small part of the information it might be 
made to yield. Were it the object of this paper to discuss the 
value of individual methods and processes the idea of the 
"probable error" and of the various measures of correlation 
would be illustrated. But enough has been given to show
the general value and especially to suggest what is doubt- 
less the most important value of formal methods for econom- 
ics: that if economics is to make use of more than the most 
obvious statistical facts and relations it must employ methods 
adequate to the bringing out of the significance of the data. 
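
For readers who want a concrete instance, the "probable error" mentioned above can be sketched on the stature table given earlier. The block assumes the conventional definition, 0.6745 times the standard error of the mean, which presumes roughly normal and homogeneous data; the variable names are hypothetical.

```python
from math import sqrt

stature = list(range(59, 75))  # inches, from the table of student statures
frequency = [1, 2, 2, 11, 11, 48, 45, 97, 100, 126, 103, 97, 45, 46, 4, 1]

n = sum(frequency)
mean = sum(x * f for x, f in zip(stature, frequency)) / n
variance = sum(f * (x - mean) ** 2 for x, f in zip(stature, frequency)) / (n - 1)
std_dev = sqrt(variance)

# Probable error of the mean: under the normal assumption, about half of
# repeated samples would give a mean within this distance of the true value.
probable_error = 0.6745 * std_dev / sqrt(n)

print(f"n = {n}, mean = {mean:.2f} in., s.d. = {std_dev:.2f} in., "
      f"probable error of mean = {probable_error:.3f} in.")
```

The probable error here is a small fraction of an inch, so quoting the mean to more than one or two decimal places would claim more than the data support.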

It is not meant to imply that an elaborate formula is always 
or even ordinarily essential to the demonstration of a conclu- 
sion in statistics. But the statistician and economist should 
be acquainted with the general methods in order to obtain the 
advantages of sharply defined technical concepts and of sys- 
tematic and generally accepted habits of thinking and ways of 
attacking a statistical problem. Only then can it be correctly 
decided, for example, when the most superficial methods of 
estimating causal connections are sufficient; when graphic 
methods give results with all the accuracy that the data war-
rant or the problem in hand demands; and when it is advisable
to employ more detailed and exact methods. Thus the eco- 
nomist has need for a science of statistics similar to the need of 
the biologist for chemistry and microscopic technique and of the
engineer for physics and mathematics. 

An extensive employment of statistical data means that the 
material must be collected as representative or typical data and 
not by complete enumerations. The expense and labor in- 
volved render the second method prohibitive, and besides few 
objects of economic inquiry lend themselves to the process of 
complete enumeration. Indeed, much can be said in favor of 
the argument that better data for the purposes of economics 
can be obtained by carefully selecting the material. Moreover, 
though the data may be gathered by an exhaustive process of 
counting it must be considered as typical, as a true pattern of 
what may be expected to occur again and again under similar 
conditions, if it is to be of value in establishing a principle or 
verifying a deduction. Reliable and effective work with typi- 
cal data can not be done without an extensive acquaintance 
with statistical theory. For this reason, if for no other, statis-
tical economics can not be developed to a significant extent
until methods adapted to its peculiar needs have been worked 
out and popularized among economists. 

It may well be asked what effect the introduction of more
formal methods may tend to have on the quality of the em- 
pirical work of economics and on the effectiveness of economic 
work in general. 

Accuracy in the collection or production of statistical facts 
and accuracy in determining and stating the degree of con- 
fidence that can be placed in the data is a matter of fundamen- 
tal and vital importance for economics. The statistician who 
realizes how little has been done in this respect and how difficult 
it is to secure proper appreciation of the necessity for extreme 
carefulness and caution in accepting statistical material may 
well feel apprehensive of the effect of the introduction of new 
and apparently easy methods of deducing striking results. 
Perhaps the most suggestive way of estimating the probable 
influence of the extensive adoption of statistical methods on 
the quality of the data is by studying the conditions which 
somewhat similar circumstances have produced in other 
sciences. It is also of interest to note the working relations 
that have gradually grown up between the experimental and 
empirical elements on the one side and the so-called theoret- 
ical parts on the other. 

The development of profound mathematical methods in 
physics has not tended to lessen the accuracy of the laboratory 
work but to increase it. On the somewhat slender basis of 
Hertz's experiments, Maxwell produced his mathematical 
theory of electromagnetism and ether waves. On the basis 
of and as a result of this theory, Marconi invented the wireless
telegraph. This working together of theory and experiment 
along with the feeling that no result can be accepted until it 
has received both theoretical and experimental verification is 
of definite and positive significance for economics. If the two 
aspects can get on so well in the field of physical science, why 
not in the social sciences? 

Psychology is similar to economics in that it deals with data 
subject to large variations in the individual measurements so 
that only aggregate methods can be employed. There is much 
discussion among psychologists regarding certain points of 
method, but it is generally agreed that experimental results 
must be reduced before they become completely intelligible. 
An important consideration from our point of view is that the
introduction of somewhat complicated methods for determin- 
ing correlation and variation has stimulated both the produc- 
tion of experimental data and the critical discussion of such 
data, which can but result in better and more accurate ex- 
perimental work. Moreover, the results in psychology tend 
to show that not only is the accuracy of the observational 
work increased but also the science itself is greatly enriched by 
the introduction of formal methods of reducing the statistical 
data. 

The history of biology since the time of Darwin is especially 
instructive. Darwin, both by his example and by the stimu- 
lating influence of his work, gave great impetus to observa- 
tional methods in biology. The researches that have resulted 
consist essentially of studies in the comparative variations 
among different biological classes and of the inter-relations of 
such variations. These variations are often small so that 
appropriate and adequate methods of dealing with the peculiar 
problems of the data are imperative. Professor Pearson in 
his " Mathematical Contributions to the Theory of Evolution" 
developed the working rules and principles which have been 
almost universally adopted by statistical biologists. Since 
so much of this statistical work is largely routine in character 
the results have been satisfactory on the whole even though 
few biologists have the mathematical training to understand 
the formulas. 

However, the extensive employment in this mechanical 
fashion of highly developed but little understood methods 
very naturally has resulted in mistakes which if not so serious 
would be ridiculous in some instances. Using six- or seven-
place logarithms with data subject to a high per cent. of error;
smoothing curves by methods involving an overwhelming mass 
of arithmetic when a better curve could be obtained by simple 
graphic means; failing to realize that the "probable error" 
is a safe guide only for homogeneous data; losing sight 
of the necessary limitations of the theory of the coefficient of 
correlation, are only a few of the statistical sins which some 
biologists have committed in the name of scientific methods. 

Aside from the useless expenditure of labor, computing to 
so much greater length than the data warrant would not be
a matter of great concern were it not for the fact that such a 
show of accuracy is often positively misleading. The wide 
margin of error in the original data is presently forgotten and 
the results taken with all their apparent accuracy. 
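
The point can be made concrete with a crude worst-case bound. The sketch below takes the unvaccinated row of the Sheffield table and supposes, purely for illustration, that each count may be off by 3 per cent.; the margin, the function name `ratio_bounds`, and the treatment of the two errors as unrelated are all assumptions, not figures from the source.

```python
def ratio_bounds(numerator, denominator, relative_error):
    """Range of numerator/denominator when each figure may be off by relative_error."""
    low = numerator * (1 - relative_error) / (denominator * (1 + relative_error))
    high = numerator * (1 + relative_error) / (denominator * (1 - relative_error))
    return low, high

deaths, cases = 274, 552   # unvaccinated row of the Sheffield table
assumed_error = 0.03       # hypothetical 3 per cent. margin in each count

rate = deaths / cases
low, high = ratio_bounds(deaths, cases, assumed_error)
print(f"rate as computed:  {rate:.6f}")
print(f"defensible range:  {low:.2f} to {high:.2f}")
```

Under that assumption the six-place figure is meaningful only in its first decimal place or so, since the range runs from about 0.47 to about 0.53.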

A class of scientists trained, as are the biologists, in system- 
atic thinking would not be guilty of such loose reasoning if 
they thoroughly understood the methods they were using. 
Nothing but mere routine and that only when done under 
immediate and responsible supervision can be safely trusted 
to persons with inadequate preliminary training. The statis- 
tician should not make use of a formula or method until he 
thoroughly comprehends the assumptions on which it is based 
and until he knows the conditions and limits of its validity. 
And further, a formula should not be used unless the results 
derived by it can be clearly interpreted in terms of the initial data 
and conditions. The biologist and economist can safely call 
upon the mathematician to derive the formulas which make 
it possible to pass from the raw data to the finished result, 
but the mathematician can not always be trusted to estimate 
the accuracy in the raw data themselves or to tell what formulas 
are the most appropriate under the given circumstances; this 
absolutely essential part can be done only by one who is 
thoroughly conversant with the statistical material and who 
has at least a reasonably comprehensive idea of the methods. 

Thus the effect on the scientific character and value of the 
inductive or experimental work in those sciences in which 
there have been applications of formal statistical methods 
seems to give no ground for a fear that the quality of the 
statistical work in the social sciences will be lowered by careful 
and systematic use of more standardized methods. Aside from 
the support which a study of the history of the other sciences 
may lend to this conclusion, it is logically sound to expect that 
the character of statistical work will improve as more attention 
is given to technical principles. It is only by the building up 
of a body of systematic principles and methods that a subject 
of study can be raised to the dignity of a science. The mere 
fact of the existence of formal methods imparts a definiteness 
which tends to stimulate systematic thinking even though 
the actual methods are not consciously or formally made use of.
Besides there is always a gain in doing routine and detail work 
according to orderly, systematic, and generally accepted meth- 
ods. For instance, the consciousness that the methods em- 
ployed have been tried and proved produces a confidence in 
the results which can be obtained in no other way. The too 
frequently encountered opinion that one "can prove anything 
by statistics" is due partly to the lack of generally recognized 
methods of measuring the degree of connection between related 
series of events, and partly to the failure to critically value the 
reliability of the data, and much of this failure is due to the 
lack of simple but uniformly applicable methods of measuring 
such reliability. The non-technical person is quite ready to 
rely on the conclusions of specialists provided the specialists 
are reasonably agreed among themselves. But with methods 
as tentative and dependent on personal peculiarities as are the 
ordinary methods of statistics such an agreement is impossible. 

An increased appreciation of statistical work is bound to 
react favorably on social science in general. It is possible that
economics should become more professionalized than at present. 
Most individuals of average intelligence would assent to the 
statement that the trained economist is better able to decide
complicated economic questions than are they themselves, but 
if it came to a matter of personal concern it is doubtful whether 
the opinion of the specialist in economics would be held in such 
respect as would that of the physician or lawyer. While there
is no particular reason for thinking that formal statistical
theory can or should become popular in the generally ac- 
cepted sense of the term so that anyone could make use of it, 
yet by giving to statistics, and hence to much of economics,
uniformity of method and quantitative definiteness, making 
possible more elaborate and thoroughgoing investigations, the 
science would become more professionalized with a consequent 
increased respect for economics on the part of the public. 

However, economics can not hope to escape the experimenta- 
tion of those who are fascinated by the possibilities of the 
newer developments in statistical methods but who do not 
adequately realize the inherent limitations of their data 
or who do not have sufficient acquaintance with the working 
principles of the methods to use them discriminatingly. To do
trustworthy and effective work in statistical economics, the 
economist must be a statistician, and especially must under- 
stand the material from which the data are taken and must 
know the degree of confidence that the data warrant. To 
determine the accuracy of the data and to so analyze the 
numerical facts as to obtain the maximum amount of informa- 
tion from them requires an extensive training in somewhat 
complicated arithmetic and theory. 

But there are few economists with the mathematical equip- 
ment necessary for statistical purposes and still fewer mathe- 
maticians with an appreciation of the problems of the 
economist. The physicists have adjusted themselves to a 
similar situation so that most physicists know considerable 
mathematics and usually the mathematical student has turned 
to physical science for a minor study. While this arrangement 
has worked well for the science in question persons so trained 
are not particularly qualified to take up statistical investiga- 
tions. The material of the social sciences is so radically different 
from that of the physical sciences that it is extremely hard 
for the physicist, for instance, to adjust his habits of thinking 
to the new standards. It is very easy for five-place standards 
of accuracy to be in this way carried over into a field where 
the figures may often have a margin of error of several per 
cent. 

If students of mathematics are encouraged to take up eco- 
nomics and the more statistical parts of social sciences in 
general as secondary subjects, great improvements and simplifi- 
cations in statistical methods will be possible. Only in this 
way can the economist obtain the aid which experience in 
other sciences shows to be necessary. 

On the other hand, the student of economics must be better 
trained in mathematics; not so much in the material which 
makes up the greater part of the courses in mathematics as 
planned for engineering students as in courses having the needs 
of the social sciences more in view. Such a course should in- 
clude detailed practice in algebraic manipulation and in certain 
especially useful topics of analytic geometry and should lay 
emphasis on the subject of probability. It is no doubt ad-
visable to have a separate course for the more technical topics
of statistical methods such as the smoothing of data, curve 
fitting, measures of accuracy, correlation, etc. With these 
two courses students not desiring to specialize in statistics 
can obtain a fairly comprehensive knowledge of statistical 
methods by taking the technical course only, while the student 
expecting to make considerable use of statistics would of 
course need the more extensive mathematical training. 

The relative extent of these two courses should probably 
vary to accord with local conditions and requirements. It 
seems quite certain, however, that the practice of devoting a 
few lectures to the more technical phases of statistical methods 
in connection with courses in economic statistics can not be 
productive of results of great value, because such courses are 
ordinarily given by persons primarily interested in some phases 
of economics and consequently not likely to have great interest 
in so characteristically formal a study as statistical meth- 
ods or mathematics; and because the subject is too extensive 
and complicated for so brief a presentation. The slight ac- 
quaintance gained in this way may indeed be a positive detri- 
ment if it does not impress the student with the extent and 
difficulty of the theory and with the necessity for extreme 
care and caution in its application. It is easy to lose sight 
of the fact that a discriminating statistical judgment can be 
attained only by long training and practice. 

It would seem therefore that the most desirable arrange- 
ment is for the course in statistical methods to be given by 
an instructor who is especially interested in methods and for- 
mal theories and who has had the benefits of an appropriate 
mathematical training, and for this course to be followed 
by the courses in economic statistics and the other courses in 
social science which make use of statistics and in which formal 
methods can be employed to advantage. 



